Data-driven Formant Synthesis of Speaker Age

Author

  • Susanne Schötz
Abstract

This paper briefly describes the development of a research tool for analysis of speaker age using data-driven formant synthesis. A prototype system was developed to automatically extract 23 acoustic parameters from the Swedish word ‘själen’ [ˈɧɛːlən] (the soul) spoken by four differently aged female speakers of the same dialect and family, and to generate synthetic copies. Functions for parameter adjustment as well as audio-visual comparison of the natural and synthesised words using waveforms and spectrograms were added to improve the synthesised words. Age-weighted linear parameter interpolation was then used to synthesise a target age anywhere between the ages of two source speakers. After an initial evaluation, the system was further improved and extended. A second evaluation indicated that speaker age may be successfully synthesised using data-driven formant synthesis and weighted linear interpolation.
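The age-weighted linear interpolation mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the dictionary-of-tracks data layout, and the assumption that both speakers' parameter tracks have already been aligned to the same number of frames are all hypothetical choices made for illustration. The abstract does not say how the 23 parameters are stored or time-aligned.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> list of per-frame values.
ParamTracks = Dict[str, List[float]]


def age_weight(target_age: float, age_a: float, age_b: float) -> float:
    """Return the weight of speaker B (0 at age_a, 1 at age_b)."""
    if age_a == age_b:
        raise ValueError("source speakers must differ in age")
    lo, hi = sorted((age_a, age_b))
    if not lo <= target_age <= hi:
        raise ValueError("target age must lie between the source ages")
    return (target_age - age_a) / (age_b - age_a)


def interpolate_parameters(
    params_a: ParamTracks,
    params_b: ParamTracks,
    age_a: float,
    age_b: float,
    target_age: float,
) -> ParamTracks:
    """Linearly blend two speakers' parameter tracks for a target age.

    Assumes (hypothetically) that both speakers share the same parameter
    names and that each pair of tracks has the same number of frames.
    """
    w = age_weight(target_age, age_a, age_b)
    if params_a.keys() != params_b.keys():
        raise ValueError("both speakers need the same parameter set")
    blended: ParamTracks = {}
    for name, track_a in params_a.items():
        track_b = params_b[name]
        if len(track_a) != len(track_b):
            raise ValueError(f"track length mismatch for {name!r}")
        # Frame-by-frame weighted average of the two source values.
        blended[name] = [(1.0 - w) * a + w * b for a, b in zip(track_a, track_b)]
    return blended
```

For example, with illustrative (invented) F1 values of 500 Hz for a 20-year-old and 700 Hz for a 60-year-old, a target age of 40 gives a weight of 0.5 and a blended value midway between the two.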

Similar resources

F0 and Segment Duration in Formant Synthesis of Speaker Age

This paper describes the work with F0 and segment duration when developing a prototype system for analysis of speaker age using data-driven formant synthesis. The system was developed to extract 23 parameters from the test words—spoken by four differently aged female speakers of the same dialect and family—and to generate synthetic copies. Audio-visual feedback enabled the user to compare the n...

Exploring data driven parametric synthesis

This paper describes our work on building a formant synthesis system based on both rule-generated and database-driven methods. Three parametric synthesis systems are discussed: our traditional rule-based system, a speaker-adapted system, and finally a gesture system. The gesture system is a further development of the adapted system in that it includes concatenated formant gestures from a data-d...

Formant analysis and synthesis using hidden Markov models

This paper describes a unifying framework for both formant tracking and speech synthesis using Hidden Markov Models (HMMs). The feature vector in the HMM is composed of the first three formant frequencies, their bandwidths and their deltas over time. Speech is synthesized by generating the most likely sequence of feature vectors from an HMM trained with a set of sentences from a given speaker. Hi...

Formant diphone parameter extraction utilising a labelled single-speaker database

This paper examines a method for formant parameter extraction from a labelled single-speaker database for use in a formant-parameter diphone-concatenation speech synthesis system. This procedure commences with an initial formant analysis of the labelled database, which is then used to obtain formant (F1-F5) probability spaces for each phoneme. These probability spaces guide a more careful speaker...

Towards synthesis of speaker age: A perceptual study with natural, synthesized and resynthesized stimuli

As a first step towards synthesis of speaker age, the hypothesis that spectral cues may be more important for age perception than F0 and duration was tested in a pilot listening experiment with male speaker stimuli consisting of natural, synthesized and resynthesized isolated words. Results indicate that spectral information is dominant over pitch as a cue for age. Slow speech rate also seems to ...

Publication date: 2006